Contact

Education

Technical Skills

Strategic Skills

Toolset

Christopher Smith

Data science professional with 8 years of experience empowering organizations with data and insights. My expertise covers multiple areas: defining metrics and extracting insights from imperfect data; general software development, including object-oriented development; and the complete machine learning lifecycle, from data extraction to deployment.

Professional Experience

Senior Machine Learning Analyst

Google

Mountain View, CA

2022 - 2019

  • Built algorithmic protections using machine learning and general programming to remove unwanted content from Google platforms at scale (>10 billion texts, images, videos, and URLs scanned per day).
  • Collaborated with stakeholders, including some of the largest product teams within Google, to build, tune, and launch models.
  • Awarded the Google Counter Abuse Innovation Award and 10 “Peer Bonuses” given by colleagues.
  • Spam Text Classifier: Trained and evaluated an ML model end to end. Worked with a partner team to build the dataset. Used NLTK and internal text libraries to clean input text and compute features such as part-of-speech tags. The model achieved a 31% improvement in recall at 95% precision over the prior model.
  • Phishing Campaign Prevention: Key analyst on a dedicated team preventing phishing (e.g., login credential theft). Investigated false negatives (misses) using SQL and Python and addressed phishing attacks on Google products in real time. Implemented anti-obfuscation web crawls for a large Google team, leading to a ~160% increase in phishing warnings.
  • Web Page Abuse ML Classifier: Identified and pulled data, iterated on features, trained a boosted-trees classifier in TensorFlow, ran a live-traffic evaluation (Python), and worked with engineers to deploy the model. Result: ~620K incremental URLs per week flagged for takedown and ~130K incremental URLs flagged for manual review, representing a ~30% increase in coverage.

Modeling & Analytics Associate Manager

Accenture Federal Services

Alexandria, VA

2019 - 2015

  • Built machine-learning-driven solutions to support contracts with multiple large government agencies.
  • Key member of two technical demonstrations, contributing significantly to contract wins in excess of $20 million. The demos required data analysis and high-speed coding under intense time pressure.
  • ML Plagiarism Detection: Co-led a team to automate detection of repeated themes and phrases across thousands of documents, supporting fraud investigations within the first week of field use. Applied OOP principles to architect the overall model framework in Python. Developed a binary classifier with >90% precision in identifying client-labeled fraud. Loaded results into Elasticsearch on AWS for client consumption.
  • Biographic Entity Resolution: Designed and built an entity resolution framework in Java for probabilistic matching of biographic records (names, birthdates, document numbers, etc.) in search. Applied text similarity techniques including edit distance, longest common substring, and Double Metaphone encoding. Launched 3 search relevancy models in Java within microservices, leading to 30%, 89%, and 50% reductions in manual workload, respectively, for critical government verification processes.

Quantitative Analyst

Agilex Technologies

Alexandria, VA

2015 - 2014

  • Identification of Criminal Activity in Networks: Applied network-theory concepts such as shortest paths, node centrality, and neighborhood detection to identify risky entities in an interaction graph. Performed the analysis with the network analysis libraries igraph (R) and JGraphT (Java).